Noun Phrase Translations for Cross-Language Document Selection

Authors

  • Fernando López-Ostenero
  • Julio Gonzalo
  • Anselmo Peñas
  • M. Felisa Verdejo
Abstract

This paper presents results for the CLEF interactive Cross-Language Document Selection task at UNED. Two translation techniques were compared: the standard Systran translations provided by the CLEF organizers as a baseline, and a phrase-based pseudo-translation approach that uses a phrase alignment algorithm based on comparable corpora. The hypothesis being tested was that noun phrase translations could serve as summarized information for relevance judgment without compromising the precision of such judgments. In addition, we wanted an indirect measure of the quality of our phrase extraction process, which had previously been developed for an interactive CLIR application. The results of the experiment confirm that the hypothesis is reasonable: a set of 8 monolingual Spanish speakers judged English documents with the same precision for both systems, but achieved 52% more recall using phrasal translations than using full Systran translations.

Similar resources

Noun phrases as building blocks for cross-language Search Assistance

This paper presents a Foreign-Language Search Assistant that uses noun phrases as fundamental units for document translation and query formulation, translation and refinement. The system (a) supports the foreign-language document selection task providing a cross-language indicative summary based on noun phrase translations, and (b) supports query formulation and refinement using the information...

Experiments with a Noun-Phrase driven Statistical Machine Translation System

This paper presents a noun phrase driven two-level statistical machine translation system. Noun phrases (NPs) are used as the unit of decomposition to build a two level hierarchy of phrases. English noun phrases are identified using a parser. The corresponding translations are induced using a statistical word alignment model. Identified noun phrase pairs in the training corpus are replaced with...

Using Noun Phrase Heads to Extract Document Keyphrases

Automatically extracting keyphrases from documents is a task with many applications in information retrieval and natural language processing. Document retrieval can be biased towards documents containing relevant keyphrases; documents can be classified or categorized based on their keyphrases; automatic text summarization may extract sentences with high keyphrase scores. This paper describes a ...

Collocational Clashes in the Persian Translations of Tuesdays with Morrie

This study aimed at finding features of collocational deviations in the translations of Tuesdays with Morrie. In this direction, categories of collocations and collocational clashes, as well as causes of collocational clashes were explored. The present work investigated five Persian translations of the novel. All the books were examined completely and all possible collocational clashes were...

A Note on Mandarin Possessives, Demonstratives, and Definiteness

Yang (2004) observes that in Mandarin, an initial possessor phrase (PossessorP) may be followed by a bare noun as in (1), or by a possessee phrase that can be headed by a numeral and classifier, [Numeral + CL + N], as in (2) or by a demonstrative, [Dem + (Numeral) + CL + N] as in (3). (In all the examples in this section, we begin with Yang’s own initial glosses and translations. The interpreta...

Publication date: 2001